Joint Modeling of Text and Acoustic-Prosodic Cues for Neural Parsing

Authors

  • Trang Tran
  • Shubham Toshniwal
  • Mohit Bansal
  • Kevin Gimpel
  • Karen Livescu
  • Mari Ostendorf
Abstract

In conversational speech, the acoustic signal provides cues that help listeners disambiguate difficult parses. For automatically parsing a spoken utterance, we introduce a model that integrates transcribed text and acoustic-prosodic features using a convolutional neural network over energy and pitch trajectories coupled with an attention-based recurrent neural network that accepts text and word-based prosodic features. We find that different types of acoustic-prosodic features are individually helpful, and together improve parse F1 scores significantly over a strong text-only baseline. For this study with known sentence boundaries, error analysis shows that the main benefit of acoustic-prosodic features is in sentences with disfluencies and that attachment errors are most improved.
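The feature-fusion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each word comes with a short sequence of frame-level [energy, pitch] values, that a small bank of 1-D convolution filters is max-pooled over time to give a fixed-size vector per word, and that this vector is concatenated with the word embedding and any word-based prosodic features before being passed to the recurrent parser. The attention-based recurrent network itself is not shown. The names `conv1d_max_pool`, `word_input_vector`, and `word_prosodic_features` are hypothetical.

```python
def conv1d_max_pool(frames, filters):
    """Apply 1-D convolution filters over time, then max-pool over time.

    frames:  list of [energy, pitch] pairs, one per acoustic frame.
    filters: list of filters; each filter is a list of [w_energy, w_pitch]
             weight pairs, so its length is the filter width in frames.
    Returns one pooled activation per filter.
    """
    pooled = []
    for filt in filters:
        width = len(filt)
        best = 0.0  # ReLU output is never negative, so 0.0 is a safe floor
        for start in range(len(frames) - width + 1):
            window = frames[start:start + width]
            act = sum(
                w * x
                for weights, frame in zip(filt, window)
                for w, x in zip(weights, frame)
            )
            best = max(best, act)  # ReLU followed by max over time
        pooled.append(best)
    return pooled


def word_input_vector(word_embedding, frames, filters, word_prosodic_features):
    """Concatenate text and acoustic-prosodic features for one word.

    The resulting vector is what a recurrent encoder would consume per word
    (assumed layout: embedding, pooled CNN outputs, word-level features).
    """
    return (
        list(word_embedding)
        + conv1d_max_pool(frames, filters)
        + list(word_prosodic_features)
    )


# Toy example: three frames of [energy, pitch] aligned to one word.
frames = [[1.0, 0.0], [2.0, 1.0], [0.0, 3.0]]
filters = [
    [[1.0, 0.0], [0.0, 1.0]],  # width-2 filter: energy now, pitch next frame
    [[-1.0, -1.0]],            # width-1 filter that never fires on this input
]
vector = word_input_vector([0.1, 0.2], frames, filters, [0.5])
```

Max-pooling over time gives every word a vector of the same size regardless of its duration, which is why a fixed-width recurrent input can be built from variable-length pitch and energy trajectories.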

Similar resources

RNN-based prosodic modeling for mandarin speech and its application to speech-to-text conversion

In this paper, a recurrent neural network (RNN) based prosodic modeling method for Mandarin speech-to-text conversion is proposed. The prosodic modeling is performed in the post-processing stage of acoustic decoding and aims at detecting word-boundary cues to assist in linguistic decoding. It employs a simple three-layer RNN to learn the relationship between input prosodic features, extracted f...

How far can prosodic cues help in word segmentation?

Prosodic cues are of great importance in parsing speech signal into prosodic and lexical units. Listeners detect the changes of the prosodic parameters and interpret them to detect sentence modalities or the mood of the speaker. Some automatic speech recognition systems try to use prosodic parameters to detect boundaries of prosodic units and help thus the acoustic decoding process. Although th...

Using acoustic and prosodic cues to correct Chinese speech repairs

Speech repairs introduce much noise in spoken language processing. Properly correcting speech repairs can help the speech recognizer to avoid the textual errors, and prevent the interpretation errors during the subsequent processing. Because the task of repair processing cannot defer to the latter (word segmentation, part-of-speech tagging and sentence parsing) stages, this paper employs acoust...

Prosodic mapping of text font based on the dimensional theory of emotions: a case study on style and size

Current text-to-speech systems do not support the effective provision of the semantics and the cognitive aspects of the documents’ typographic cues (e.g., font type, style, and size). A novel approach is introduced for the acoustic rendition of text font based on the emotional analogy between the visual (text font cues) and the acoustic (speech prosody) modalities. The methodology is based on: ...

Unsupervised Syntactic Chunking with Acoustic Cues: Computational Models for Prosodic Bootstrapping

Learning to group words into phrases without supervision is a hard task for NLP systems, but infants routinely accomplish it. We hypothesize that infants use acoustic cues to prosody, which NLP systems typically ignore. To evaluate the utility of prosodic information for phrase discovery, we present an HMM-based unsupervised chunker that learns from only transcribed words and raw acoustic correl...

Journal title:
  • CoRR

Volume: abs/1704.07287

Publication year: 2017